ABSTRACT

With economic globalization and the continuous
development of e-commerce, customer relationship
management (CRM) has become an important factor
in a company's growth. CRM requires large expenditures.
One way to profit from a CRM investment and
drive better results is through machine learning.
Machine learning helps a business to manage,
understand, and serve customers at the
individual level. Both customer segmentation and
buyer targeting help the business to improve
marketing performance.

The objective is to propose a new approach for better 
customer targeting. 
Keywords: Customer Relationship Management (CRM),
Machine Learning, Customer Segmentation,
Customer Targeting, K-means algorithm, SMOTE
I. INTRODUCTION

CRM requires a large expense in the form of
implementation, updates, and training. One way to
improve ROI and drive better outcomes from the large
volume of data available from sales, marketing, and
customer support is to implement machine learning on
top of existing CRM systems. Predictive CRM
is a system that gathers both internal and external data
about prospects to predict which accounts are more
likely to buy. Customer targeting is one of the most
important components of customer relationship
management (CRM) systems. Customer targeting
identifies promising prospects to increase revenue.
Improving customer targeting is important for
reducing overall cost and boosting business
performance. Marketing professionals achieve this
task using a classification method for buyer
targeting. We propose a hybrid algorithm intended
to improve customer targeting performance.

I. 1 Analysis Scenario: 

To identify prospective customers, it is important
to measure a subject’s “propensity to buy” a particular
product. We can take advantage of the large amount
of demographic data to target only those who have the
highest propensity to buy, thus increasing our chance
of success [9].

We will devise a method that exploits the customer
data in conjunction with demographic data from
the overall market population, which contains buyer vs.
non-buyer data, using hybrid algorithms to improve
customer targeting through better classification
performance.

II. Related Work 

One of the key problems in CRM is buyer targeting,
that is, identifying the prospects that are most likely to
become customers. Marketers are applying data
mining tools to solve the problem. For example, the
authors of [1] focused on classifying online customers
based on their website behavior, and [2]
applied neural networks guided by genetic algorithms
to target households. [3] proposed a new feature
selection technique. In that work, the classification
performance of the C4.5 decision tree, Naive Bayes
classifier, SVM classifier, and KNN classifier was
compared, and the SVM classifier was found to work
best with this methodology. [4] proposed a hybrid
algorithm that uses clustering and decision tree
induction to classify the data samples. This approach
avoids burdening the decision tree with large
datasets by dividing the data samples into clusters. In
[5] the author suggested a customer classification and
prediction model for a commercial bank that uses
collected customer information as input to make
predictions for credit card proposals. She
implemented a Naive Bayesian classifier algorithm. [8]
developed individually tailored predictive models for
each segment to maximize targeting accuracy in the
direct-mail industry. In such a step-by-step approach,
the buyer targeting (the second step) becomes
dependent on the results of customer segmentation
(the first step). However, the customer segmentation
has to be implemented independently. [9] proposed to
first use K-Means clustering to segment customers
and then build segment-wise predictive models to
better target the promising customers. In [12],
customer segmentation and buyer targeting were
formulated as a single unified optimization
problem. The integrated approach not only
improves buyer targeting performance but also
provides a new perspective on segmentation based on
the buying decision preferences of the customers. A
new K-Classifiers Segmentation algorithm was
developed to solve the unified optimization problem.

III. Algorithms 

The k-means algorithm was used for customer
targeting in [12] and for feature selection in [4]. The
SMOTE algorithm was used for handling class imbalance
in [11]. Logistic regression was used as a benchmark for
the comparative analysis of the RFM and FRAC methods
in [13]. These algorithms are described below.

A. K-means Clustering

Real-life datasets have a large number of
features [4]. Grouping these features on the basis of
similarity is required. Clustering is an unsupervised
method of separating a large amount of data into
subsets with similar characteristics. Different clustering
methods can generate different groupings for the same
set of data samples. Clustering methods can be broadly
classified as partition-based and hierarchical. Some
examples of partition-based clustering techniques
are k-means and k-medoids. The algorithm
proposed in this paper uses the k-means algorithm for
feature selection.
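
As an illustration of partition-based clustering, the following is a minimal sketch of the basic k-means procedure (alternating assignment and centroid update steps), written in plain Python. It is not the implementation used in this paper: the function name k_means, its parameters, and the toy data are assumptions made for the example, and how the clusters are then used for feature selection is not shown.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, initial_centroids=None, max_iterations=100, seed=0):
    """Basic k-means; returns (labels, centroids).

    points: list of numeric tuples. initial_centroids is an optional,
    hypothetical convenience parameter; if omitted, k points are drawn
    at random as starting centroids.
    """
    rng = random.Random(seed)
    if initial_centroids is None:
        centroids = rng.sample(points, k)
    else:
        centroids = list(initial_centroids)

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy data (invented for illustration): two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = k_means(
    points, 2, initial_centroids=[(1.0, 1.0), (8.0, 8.0)]
)
```

Because the result depends on the starting centroids, different initializations can produce different groupings of the same data.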

B. Logistic Regression 

In the logistic regression model, the predicted values
for the dependent variable will always be greater than
(or equal to) 0 and less than (or equal to) 1 [10].

The name logistic stems from the fact that one can 
easily linearize this model via the logistic 
transformation. Suppose we think of the binary 
dependent variable y in terms of an underlying 
continuous probability p , ranging from 0 to 1. We can 
then transform that probability p as:

$$ p' = \ln\left(\frac{p}{1 - p}\right) $$

Logistic regression is useful for several reasons:
(1) logistic modeling is conceptually simple; (2) it is
easy to interpret compared to other methods such as
ANNs; (3) logistic modeling has been shown to provide
good and robust results in comparison studies [6]. For
database marketing applications, several authors [7]
have shown that logistic modeling may
outperform more sophisticated methods.
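
The logistic transformation described in this section can be sketched in a few lines of Python, assuming the standard logit form ln(p / (1 - p)) and its inverse, the logistic function. The function names, weights, and feature values below are hypothetical and chosen only for illustration; the sketch shows why a linear score of any size still maps to a prediction between 0 and 1.

```python
import math


def logit(p):
    """Map a probability strictly between 0 and 1 to the real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Logistic function: map any real score back to a probability.

    The two branches avoid overflow in math.exp for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(weights, bias, features):
    """Linear score passed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return inverse_logit(z)


# Hypothetical coefficients and customer features, for illustration only.
p_buy = predict_probability([2.0, -1.0], 0.5, [100.0, 3.0])
```

However extreme the linear score becomes, the returned value stays within the interval from 0 to 1, which is what makes the model linear on the transformed scale while keeping predictions interpretable as probabilities.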

C. SMOTE

SMOTE is an over-sampling approach in which the
minority class is over-sampled by creating “synthetic”
examples rather than by over-sampling with
replacement [2, 11]. The minority class is over-sampled
by taking each minority class sample and introducing
synthetic examples along the line segments joining
any or all of the k minority class nearest
neighbors [11, 2]. Depending upon the amount of
over-sampling required, neighbors from the k nearest
neighbors are randomly chosen [2].
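
The interpolation step described above can be sketched as follows. This is a simplified illustration in plain Python, not the reference SMOTE implementation: the function name smote, its parameters, and the toy minority samples are assumptions, and the sketch simply cycles through the minority samples until the requested number of synthetic examples has been produced.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    For each selected minority sample, one of its k nearest minority
    neighbors is chosen at random, and a new point is placed at a random
    position on the line segment joining the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    synthetic = []
    for i in range(n_synthetic):
        idx = i % len(minority)  # cycle through the minority samples
        sample = minority[idx]
        # k nearest minority neighbors, excluding the sample itself.
        others = [m for j, m in enumerate(minority) if j != idx]
        neighbors = sorted(
            others, key=lambda m: squared_distance(sample, m)
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(s + gap * (n - s) for s, n in zip(sample, neighbor))
        )
    return synthetic


# Toy minority class (invented for illustration).
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
new_points = smote(minority, 8, k=2)
```

Every synthetic point lies between two existing minority samples, so the minority region is filled in rather than duplicated as it would be by over-sampling with replacement.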

VI. CONCLUSION 

Previous research mainly focuses on providing a general
predictive model for the total customer base. Our
work in this paper is an attempt to unify
supervised and unsupervised learning methods for
better customer targeting. Is it possible to maximize
ROI by using hybrid algorithms that exploit
buyer vs. non-buyer customer data from the overall
market population, improving customer targeting
through better classification performance? This is the
research question we would like to answer. A case
study on real-world marketing data will be used
to evaluate the performance of the
proposed approach.